1. INTRODUCTION AND MAIN RESULTS


Suppose that $X$ is a one-dimensional random variable with distribution
$F$ and density $f$. Let $X_1,\dots,X_n$ be i.i.d. observations of $X$. It is
desired to estimate $f(x)$ from this sample. For this purpose we introduce a
partition of $R^1 = (-\infty,\infty)$:

\[ R^1 = \bigcup_{i=1}^{\infty} I(i;X_1,\dots,X_n), \qquad (1) \]

where $I(i;X_1,\dots,X_n)$, $i = 1,2,\dots$, are intervals with lengths greater
than zero, and $I(i;X_1,\dots,X_n) \cap I(j;X_1,\dots,X_n) = \emptyset$ for
$i \ne j$. Write

\[ K_n = K_n(X_1,\dots,X_n) = \{ I(i;X_1,\dots,X_n) : i = 1,2,\dots \}, \]

\[ I_n(x) = \text{the interval } I(i;X_1,\dots,X_n) \text{ containing } x, \]

\[ P_n(x) = \#(\{ i : 1 \le i \le n,\ X_i \in I_n(x) \}), \]

where $\#(A)$ denotes the number of elements belonging to $A$, and define an
estimate of $f(x)$ as follows:

\[ f_n(x) = f_n(x;X_1,\dots,X_n) = P_n(x)/(n|I_n(x)|). \qquad (2) \]

Here and in the following we write $|A|$ for the Lebesgue measure of the set
$A \subset R^1$. $f_n(x)$ is the so-called data-based histogram estimate based
on the partition $K_n$. "Data-based" means that $K_n$ depends on the sample
$X_1,\dots,X_n$, while in the ordinary histogram estimate the partition is
fixed before the sample is drawn.
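To make definition (2) concrete, here is a minimal sketch in Python. It assumes one particular, hypothetical data-based rule that is not taken from the paper: every `points_per_cell`-th order statistic becomes a cell edge. The function names are illustrative. It also assumes distinct sample values, and it sets the estimate to zero on the two unbounded outer cells, whose length is infinite.

```python
import bisect


def quantile_partition(sample, points_per_cell):
    """Cell edges from the sorted sample: every `points_per_cell`-th order
    statistic is an edge (a hypothetical data-based rule for illustration)."""
    xs = sorted(sample)
    edges = xs[::points_per_cell]
    if edges[-1] != xs[-1]:
        edges.append(xs[-1])  # close the last bounded cell at the sample maximum
    return edges


def histogram_estimate(x, sample, edges):
    """f_n(x) = P_n(x) / (n * |I_n(x)|) for cells [edges[k], edges[k+1])."""
    n = len(sample)
    if x < edges[0] or x >= edges[-1]:
        return 0.0  # x lies in an unbounded outer cell of infinite length
    k = bisect.bisect_right(edges, x) - 1  # index of the cell containing x
    lo, hi = edges[k], edges[k + 1]
    count = sum(1 for xi in sample if lo <= xi < hi)  # P_n(x)
    return count / (n * (hi - lo))
```

For example, with the nine points 0.1, 0.2, ..., 0.9 and `points_per_cell=3`, the edges are 0.1, 0.4, 0.7, 0.9, and the estimate at x = 0.25 is 3/(9 * 0.3), since three sample points fall in [0.1, 0.4).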

Write $\Delta_n = \Delta_n(X_1,\dots,X_n)$ for the $L_1$-norm of $f_n - f$:

\[ \Delta_n = \int |f_n(x) - f(x)|\,dx = \int_{-\infty}^{\infty} |f_n - f|\,dx. \qquad (3) \]
A number of papers have appeared dealing with the weak (i.e. in probability)
convergence of $\Delta_n$ to zero. Among these we mention the recent paper [2]
by J. Chen and H. Rubin, in which they prove that $\Delta_n \xrightarrow{P} 0$
under quite general conditions imposed on $K_n$. In the present article we deal
with the problem of a.s. convergence. Specifically, we prove the following
theorem.

Theorem 1. Suppose that $K_n$ satisfies the following two conditions:

1. \[ \lim_{n\to\infty} |I_n(x)| = 0, \ \text{a.s. for } x \in R^1, \ \text{a.e. } L. \qquad (4) \]

2. Denote by $C_{nt}$ the number of intervals in $K_n$ having at least one
common point with $[-t,t]$. For any fixed $t > 0$,

\[ C_{nt} = o(n/\log n), \ \text{a.s.} \qquad (5) \]

Then, for the estimator $f_n$ defined by (2), it is true that

\[ \lim_{n\to\infty} \Delta_n = 0, \ \text{a.s.}, \qquad (6) \]

where $\Delta_n$ is the $L_1$-norm of $f_n - f$ defined by (3).
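Condition (5) can be checked numerically for a given partition. The sketch below is only an illustration under stated assumptions: the partition is described by a finite list of cell edges, cells are taken as half-open intervals [lo, hi) together with the two unbounded outer cells, and the function names are hypothetical.

```python
import math


def count_cells_meeting(edges, t):
    """C_nt: number of cells (including the two unbounded outer cells)
    that share at least one point with [-t, t]."""
    bounds = [-math.inf] + sorted(edges) + [math.inf]
    count = 0
    for lo, hi in zip(bounds, bounds[1:]):
        # the cell [lo, hi) meets [-t, t] exactly when lo <= t and hi > -t
        if lo <= t and hi > -t:
            count += 1
    return count


def growth_ratio(c_nt, n):
    """C_nt / (n / log n); condition (5) asks that this tends to 0 a.s."""
    return c_nt * math.log(n) / n
```

For instance, equal-width cells of width $n^{-1/3}$ give a count of order $t n^{1/3}$, so the ratio tends to zero and (5) holds; cells of width of order $(\log n)/n$ would give a ratio bounded away from zero.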

Essentially, Chen and Rubin proved that $\Delta_n \xrightarrow{P} 0$ under our
condition 1, $C_{nt} = o_p(n)$, and another condition of a more complicated
nature. In order to prove (6) we pay the price that the condition
$C_{nt} = o_p(n)$ is strengthened to (5). It is easy to show by an example that
(5) cannot be replaced by $C_{nt} = o_p(n/\log n)$. Judging from the known
results in density estimation, it seems doubtful that condition (5) can be
substantially improved.

We also remark that Y. S. Chow and others [3] gave a result concerning
the truth of (6), where $K_n$ has the form
$K_n = \{[i/\lambda_n, (i+1)/\lambda_n),\ i = 0,\pm 1,\pm 2,\dots\}$ with
$\lambda_n = \lambda_n(X_1,\dots,X_n)$ determined by the sample in a way
described in [3], but the condition imposed on $f$ is rather stringent.

Chen and Rubin also considered in [2] the case that $X$ is multidimensional,
again for weak convergence. For the problem of strong convergence, Wang
and Chen [8] obtained some results which are of more complicated form.
Recently, we improved these results and obtained a simple one which includes
the above one-dimensional result as a special case.

In this paper we give a detailed proof for the one-dimensional case only.
In Section 2 we introduce a lemma which is needed in the sequel. The proof
of the main result is given in Section 3. In Section 4 we discuss a
generalization to the multidimensional case.


2. A LEMMA


The proof of Theorem 1 depends on a lemma which is proved in this
section. The lemma is of independent interest, and may be useful in some
related problems.

Lemma 1. Suppose that $X_1, X_2, \dots$ is a sequence of independent
one-dimensional random variables, and the distribution function $F$ of $X_1$
is continuous everywhere on $R^1$. Denote by $F_n$ the empirical distribution
function of $\{X_1,\dots,X_n\}$. Then there exist absolute constants $C_i > 0$,
$i = 0,1,\dots,4$, such that for any $\varepsilon > 0$ we have

\[ P\Big\{\sup_{A\in\mathcal{F}} |F_n(A) - F(A)| > \varepsilon\Big\}
   \le C_1\Big(\frac{\sqrt{b}}{\varepsilon\sqrt{n}} + \frac{1}{b}\Big)\exp(-C_2 n\varepsilon^2/b)
   + C_3\exp(-C_4 n\varepsilon), \qquad (7) \]

where $\mathcal{F}$ is a set consisting of some intervals $A \subset R^1$ with

\[ \sup_{A\in\mathcal{F}} F(A) \le b \le 1 \]

and $n/\log n$ is greater than $C_0/\varepsilon$.

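Proof. By a result of Komlós, Major and Tusnády [6], we can find a suitable
probability space on which we can define an i.i.d. sequence (which for
convenience will be denoted again by $X_1,\dots,X_n$) whose common distribution
is $F$, and a Brownian bridge $B_n(t)$, such that

\[ P\Big\{\sup_x |n(F_n(x) - F(x)) - \sqrt{n}B_n(F(x))| > C\log n + y\Big\}
   \le \tilde{C}\exp(-\lambda y), \qquad (8) \]

where $C$, $\tilde{C}$ and $\lambda$ are positive absolute constants. Put
$B_n(t) = W_n(t) - tW_n(1)$, where $W_n$ is a Brownian motion process, and write

\[ n(F_n(x) - F(x)) = \sqrt{n}B_n(F(x)) + e_n(x). \]

Then for $A = [a_1,a_2) \in \mathcal{F}$ we have

\[ F_n(A) - F(A) = n^{-1/2}[W_n(F(a_2)) - W_n(F(a_1))]
   - n^{-1/2}F(A)W_n(1) + n^{-1}[e_n(a_2) - e_n(a_1)]. \qquad (9) \]

From (8), when $n\varepsilon/6 > 2C\log n$ we have

\[ P\Big\{\sup_{A\in\mathcal{F}} n^{-1}|e_n(a_2) - e_n(a_1)| > \varepsilon/3\Big\}
   \le P\Big\{\sup_x |e_n(x)| \ge n\varepsilon/6\Big\}
   \le \tilde{C}\exp(-\lambda n\varepsilon/12). \qquad (10) \]

Further,

\[ P\Big\{\sup_{A\in\mathcal{F}} n^{-1/2}F(A)|W_n(1)| > \varepsilon/3\Big\}
   \le P\{|W_n(1)| > \varepsilon\sqrt{n}/(3b)\}
   \le \frac{6b}{\sqrt{2\pi}\,\varepsilon\sqrt{n}}\exp\Big(-\frac{n\varepsilon^2}{18b^2}\Big)
   \le \frac{6\sqrt{b}}{\sqrt{2\pi}\,\varepsilon\sqrt{n}}\exp\Big(-\frac{n\varepsilon^2}{18b}\Big). \qquad (11) \]

By Lemma 1.2.1 in [4] and $\sup_{A\in\mathcal{F}} F(A) \le b$,

\[ P\Big\{\sup_{A\in\mathcal{F}} n^{-1/2}|W_n(F(a_2)) - W_n(F(a_1))| > \varepsilon/3\Big\} \]
\[ \le P\Big\{\sup_{x_2 - x_1 \le b,\ 0 \le x_1 < x_2 \le 1} |W_n(x_2) - W_n(x_1)| \ge \varepsilon\sqrt{n}/3\Big\} \]
\[ \le P\Big\{\sup_{0 \le s \le 2-b}\ \sup_{0 \le t \le b} |W_n(s+t) - W_n(s)| \ge (\varepsilon\sqrt{n}/(3\sqrt{b}))\sqrt{b}\Big\} \]
\[ \le \frac{C^*}{b}\exp(-n\varepsilon^2/(18b)), \qquad (12) \]

where $C^*$ is a constant. The lemma now follows from (9)-(12).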


Remark.  The  lemma  is  an  essential  improvement  of  a  similar  result  given 
by  Devroye  and  Wagner  in  [5]  for  the  special  case  that  F  consists  of  one 
dimensional  intervals.  In  their  result,  b  is  given  by 


3. PROOF OF THEOREM 1


First we note that it is enough to show that

\[ \lim_{n\to\infty} \int_{-t}^{t} |f_n - f|\,dx = 0, \ \text{a.s. for each } t > 0. \qquad (13) \]


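For if (13) has been proved, denote by $E_t \subset \Omega$ the set (in the
sample space $\Omega$) on which (13) is not true. Then $P(E_t) = 0$. Put
$E = \bigcup_{t=1}^{\infty} E_t$. By an easy argument it is seen that

\[ \lim_{n\to\infty} \int |f_n - f|\,dx = 0 \quad \text{on } \Omega - E. \]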
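The "easy argument" can be spelled out as follows. This is only a sketch of one possible route, not necessarily the authors' own; it assumes that $f_n \ge 0$ with $\int f_n\,dx \le 1$ (which holds for (2) when the estimate is taken to be zero on cells of infinite length) and that $\int f\,dx = 1$.

```latex
\begin{align*}
\int_{|x|>t} f_n\,dx
  &\le 1 - \int_{-t}^{t} f_n\,dx
   = \int_{|x|>t} f\,dx + \int_{-t}^{t} (f - f_n)\,dx \\
  &\le \int_{|x|>t} f\,dx + \int_{-t}^{t} |f_n - f|\,dx, \\
\int_{-\infty}^{\infty} |f_n - f|\,dx
  &\le \int_{-t}^{t} |f_n - f|\,dx + \int_{|x|>t} f_n\,dx + \int_{|x|>t} f\,dx \\
  &\le 2\int_{-t}^{t} |f_n - f|\,dx + 2\int_{|x|>t} f\,dx .
\end{align*}
```

Outside the exceptional set $E$, (13) holds for every positive integer $t$, so the upper limit of the left-hand side as $n \to \infty$ is at most $2\int_{|x|>t} f\,dx$ for $t = 1, 2, \dots$, and this bound tends to zero as $t \to \infty$.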


Next define

\[ Q_n(x) = \int_{I_n(x)} f(u)\,du \,\Big/\, |I_n(x)|. \]

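In order to prove (13) it is enough to prove that

\[ \lim_{n\to\infty} \int_{-t}^{t} |f - Q_n|\,dx = 0, \ \text{a.s. for each } t > 0, \qquad (14) \]

and

\[ \lim_{n\to\infty} \int_{-t}^{t} |f_n - Q_n|\,dx = 0, \ \text{a.s. for each } t > 0. \qquad (15) \]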


By assumption 1 of Theorem 1 it is easily seen that there exists a set
$A \subset R^{\infty}$ such that $P(A) = 0$ and for $(X_1,X_2,\dots) \notin A$
we have $\lim_{n\to\infty} |I_n(x)| = 0$ for $x \in R^1$, a.e. $L$, and in turn
it follows (by the Lebesgue differentiation theorem) that
$\lim_{n\to\infty} Q_n(x) = f(x)$ for $x$, a.e. $L$. Since $Q_n(x)$ is a density
function when $(X_1,X_2,\dots)$ is fixed, by a well-known theorem due to
Scheffé, it follows that $\lim_{n\to\infty} \int |f - Q_n|\,dx = 0$ for each
fixed $(X_1,X_2,\dots) \notin A$. Thus (14) is proved.


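Now we proceed to prove (15). Denote by $A_{n1},\dots,A_{nm_n}$ those
intervals belonging to $K_n$ and having common points with $[-t,t]$. Put

\[ Z_{ni} = F(A_{ni}), \quad i = 1,\dots,m_n, \qquad
   \mathcal{F} = \{I : I \text{ is an interval in } R^1\}, \]

and denote by $\mathcal{F}(b)$ $(0 \le b \le 1)$ the set of intervals belonging
to $\mathcal{F}$ and satisfying $F(I) \le b$. Then
$\bigcup_{i=1}^{m_n} A_{ni} \supset [-t,t]$, and by Assumption 2 of Theorem 1
we have

\[ m_n = o(n/\log n), \ \text{a.s.} \qquad (16) \]

Denote by $\#(A)$ the number of elements belonging to $A$, and

\[ q_{ni} = \#(\{j : 1 \le j \le n,\ X_j \in A_{ni}\}), \quad i = 1,\dots,m_n. \]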


Then we have

\[ \int_{-t}^{t} |f_n - Q_n|\,dx \le \sum_{i=1}^{m_n} \int_{A_{ni}} |f_n - Q_n|\,dx
   = \frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}|. \qquad (18) \]

Given  e  >  0,  choose  M  >  max{64,  (C  (-p^-O3^2,  ( — ^-*-)3},  where  C  , 

o  ^  t  o 

$C_2$, $C_4$ are the constants mentioned in Lemma 1. Divide $\{1,2,\dots,m_n\}$
into a number of nonintersecting sets $J_r$ in the following way:

\[ J_0 = \{i : 1 \le i \le m_n,\ Z_{ni} \le M\log n/n\}, \]

\[ J_r = \{i : 1 \le i \le m_n,\ (M + r - 1)\log n/n < Z_{ni} \le (M + r)\log n/n\}, \quad r = 1,2,\dots \]

Define $a_i = \#(J_i)$, $i = 0,1,2,\dots$. Since $\int f\,dx = 1$, we have
$\sum_{i=1}^{\infty} a_i(M + i - 1)\log n/n < 1$. From this and $M > 1$ we have

\[ \sum_{i=1}^{\infty} a_i(M + i)\log n/n \le 2. \qquad (19) \]


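By $(M + r - 1)\log n/n < 1$, we can restrict ourselves to the cases where
$r < n/\log n$. Thus

\[ \frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}|
   = \sum_{r=0}^{n/\log n} \frac{1}{n}\sum_{i\in J_r} |q_{ni} - nZ_{ni}|
   \le \sum_{r=0}^{n/\log n} a_r \sup_{I\in\mathcal{F}_r} |F_n(I) - F(I)|, \]

where

\[ \mathcal{F}_r = \mathcal{F}((M + r)\log n/n), \quad r \ge 0. \]

Write

\[ B_0 = \Big\{\sup_{I\in\mathcal{F}_0} |F_n(I) - F(I)| > \varepsilon M\log n/n\Big\}, \]

\[ B_r = \Big\{\sup_{I\in\mathcal{F}_r} |F_n(I) - F(I)| > \varepsilon M^{-1/3}(M + r)\log n/n\Big\}, \quad r \ge 1, \]

\[ B = \bigcup_{r=0}^{n/\log n} B_r. \]

Using Lemma 1, we get

\[ P(B_0) \le C_1\Big(\frac{1}{\varepsilon\sqrt{M\log n}} + \frac{n}{M\log n}\Big)\exp(-C_2\varepsilon^2 M\log n)
   + C_3\exp(-C_4\varepsilon M\log n), \]

\[ P(B_r) \le C_1\Big(\frac{1}{\varepsilon}M^{-1/6}(\log n)^{-1/2} + \frac{n}{M\log n}\Big)\exp(-C_2\varepsilon^2 M^{1/3}\log n)
   + C_3\exp(-C_4 M^{2/3}\varepsilon\log n), \qquad (22) \]

which implies

\[ P(B) \le C/n^2, \]

where $C$ does not depend on $n$.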

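But when the event $B$ does not happen, we have

\[ \frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}|
   \le a_0\varepsilon M\log n/n + \varepsilon M^{-1/3}\sum_{r=1}^{n/\log n} a_r(M + r)\log n/n. \]

By (19) and $M > 64$ it follows that

\[ P\Big(\frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}| > a_0\varepsilon M\log n/n + \varepsilon/2\Big)
   \le P(B) \le C/n^2. \]

Hence by the Borel-Cantelli lemma we have

\[ P\Big(\frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}| > a_0\varepsilon M\log n/n + \varepsilon/2, \ \text{i.o.}\Big) = 0. \]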

Therefore, with probability one, we can assert that

\[ \frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}| \le a_0\varepsilon M\log n/n + \varepsilon/2 \qquad (23) \]

for $n$ sufficiently large. But by (16)

\[ a_0 \le m_n = o(n/\log n), \ \text{a.s.} \]

From this and (23), it follows that with probability one, we have

\[ \frac{1}{n}\sum_{i=1}^{m_n} |q_{ni} - nZ_{ni}| \le \varepsilon \qquad (24) \]

for $n$ sufficiently large. Since $\varepsilon > 0$ is arbitrarily given, (15)
follows from (18) and (24), and Theorem 1 is proved.


4.  MULTIDIMENSIONAL  CASE 

We now consider a multidimensional extension of the result in Section 3.
Let us assume that $X_1,\dots,X_n$ are i.i.d. $d$-dimensional random vectors,
and replace density, partition, interval in $R^1$ and so on by the analogues
in $R^d$. In particular, by an interval in $R^d$ we mean a set in $R^d$ having
the form $\prod_{i=1}^{d} A_i$, where the $A_i$'s are all one-dimensional
intervals. Now $C_{nt}$ in (5) is defined as the number of intervals in $K_n$
having at least one common point with the interval
$V_t = \{(x_1,\dots,x_d) : |x_i| \le t,\ i = 1,\dots,d\}$. Also, condition (4)
is replaced by the following:

(I) \[ \lim_{n\to\infty} D(I_n(x)) = 0, \ \text{a.s. for } x \in R^d, \ \text{a.e. } L, \qquad (25) \]

where $D(I)$ denotes the diameter of the set $I \subset R^d$.

For the case where $d > 1$, Chen and Rubin [2] proved that
$\Delta_n \xrightarrow{P} 0$ under (25), $C_{nt} = o_p(\sqrt{n})$ for any
$t > 0$, and another condition of a more complicated nature. Wang and Chen [8]
studied the problem of strong convergence of $\Delta_n$. They proved that
$\lim_{n\to\infty} \Delta_n = 0$, a.s. if (25) holds and one among the
following sets of conditions is satisfied:

II'. $C_{nt} = o(\sqrt{n/\log n})$, a.s. for any $t > 0$.

II''. $f$ is bounded on any bounded subset of $R^d$,

      $C_{nt} = o(n/\log n)$, a.s. for any $t > 0$,

      $\limsup_{n\to\infty} a_n(t) < \infty$, a.s. for any $t > 0$,

where

      $a_n(t) = \sup\{D(I) : I \in K_n \text{ and } I \cap V_t \ne \emptyset\}$.

II'''. For $a > 0$ large enough, the set $\{x : f(x) < a\}$ differs only by
a Lebesgue null set from an open set,

      $C_{nt} = o(n/\log n)$, a.s. for any $t > 0$,

      $\lim_{n\to\infty} a_n(t) = 0$, a.s. for any $t > 0$.

Recently, we found an inequality by which we obtain the following.


Theorem 2. Suppose that $K_n$ satisfies the condition (I) and
(II) $C_{nt} = o(n/\log n)$, a.s. for any $t > 0$.

Then (6) is true for the $d$-dimensional case.

Since the proof is parallel to that of Theorem 1, we only introduce the
related inequality.

To this end, let $x_1,\dots,x_r$ be $r$ points in $R^d$, and $\mathcal{A}$ be a
class of Borel sets in $R^d$. Denote by $\Delta^{\mathcal{A}}(x_1,\dots,x_r)$
the number of distinct sets in
$\{\{x_1,\dots,x_r\} \cap A,\ A \in \mathcal{A}\}$. Define

\[ m^{\mathcal{A}}(r) = \max_{x_1,\dots,x_r \in R^d} \Delta^{\mathcal{A}}(x_1,\dots,x_r). \]

Vapnik and Chervonenkis [7] showed that either $m^{\mathcal{A}}(r) = 2^r$ for
any positive integer $r$ or $m^{\mathcal{A}}(r) \le r^s + 1$, where $s$ is the
smallest $k$ such that $m^{\mathcal{A}}(k) \ne 2^k$. A class of sets
$\mathcal{A}$ for which the latter case holds will be called a V-C class with
index $s$.
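As a small check of these definitions, the following sketch counts by brute force the subsets that intervals pick out of $r$ points on the line (the case $d = 1$). It is an illustration only, with hypothetical function names, and it uses the fact that an interval can cut out of a finite point set exactly the empty set or a run of consecutive points in sorted order, so any $r$ distinct points give the same count.

```python
def traces_by_intervals(points):
    """All distinct subsets of `points` of the form (points intersect [a, b])."""
    pts = sorted(set(points))
    traces = {frozenset()}  # an interval that misses every point
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            traces.add(frozenset(pts[i:j + 1]))  # a run of consecutive points
    return traces


def growth_function(r):
    """m(r) for intervals on the line, evaluated on the points 0, ..., r-1."""
    return len(traces_by_intervals(range(r)))


def vc_index(max_r=10):
    """Smallest k with m(k) != 2**k, searched up to max_r."""
    for k in range(1, max_r + 1):
        if growth_function(k) != 2 ** k:
            return k
    return None
```

Here `growth_function(r)` equals $r(r+1)/2 + 1$, so $m(1) = 2$, $m(2) = 4$ and $m(3) = 7 \ne 2^3$; in the sense of the definition above, intervals on the line therefore form a V-C class with index $s = 3$, and the bound $m(r) \le r^3 + 1$ holds with room to spare.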

Suppose that $\mu$ is a probability measure on $R^d$. Let $X_1,X_2,\dots$ be a
sequence of i.i.d. random vectors with common distribution $\mu$, and $\mu_n$
be the empirical distribution of $X_1,\dots,X_n$. Denote a "distance" between
$\mu_n$ and $\mu$ by

\[ D_n(\mathcal{A},\mu) = \sup_{A\in\mathcal{A}} |\mu_n(A) - \mu(A)|. \]

Here  we  assume  that  D(A,y),  supjy  (A)  -  y9  (A)|  and  sup  yn(A)  are  all  random 

n  AeA  n  AeA  n 

variables.  We  have  the  following. 


Lemma 2. Let $\mathcal{A}$ be a V-C class with index $s$ such that

\[ \sup_{A\in\mathcal{A}} \mu(A) < \delta < 1/8. \qquad (26) \]

Then for any $\varepsilon > 0$ we have

P{Dn(A,y)  >  e}  <  5(2n)Sexp(-  ne2/(9U  +  4e))  (27) 

+  7(2n)Sexp(-6n/68) 

+  22+sn1+2sexp(-6n/8), 

provided  n  >  max  (12o/e2,  68(1  +  s)(log  2)/6). 

Proof.  See  [9]. 

In the present case, we should take $\mathcal{A}$ as some interval class in
$R^d$, which is obviously a V-C class. Also, there is no problem with the
measurability mentioned above. As an alternative lemma, we can also use
Corollary 2.9 in [1].


Acknowledgement. The authors are grateful to the referee for his helpful
suggestions.